 block size



PrivCirNet: Efficient Private Inference via Block Circulant Transformation

Neural Information Processing Systems

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead. We observe that transforming the DNN weights into circulant matrices converts general matrix-vector multiplications into HE-friendly 1-dimensional convolutions, drastically reducing the HE computation cost.
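
The identity behind this observation can be checked in plaintext. The following is a minimal sketch, not the paper's HE implementation: it assumes a single full circulant matrix rather than the block circulant structure named in the title, and the helper names `circulant` and `circular_conv` are illustrative only.

```python
import numpy as np

def circulant(c):
    """Build the n x n circulant matrix whose first column is c."""
    n = len(c)
    # Column j is c cyclically shifted down by j positions,
    # so entry (i, j) equals c[(i - j) mod n].
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

def circular_conv(c, x):
    """1-D circular convolution of c and x, computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Arbitrary example values (assumption: plaintext floats, no encryption).
c = np.array([1.0, 2.0, 3.0, 4.0])
x = np.array([0.5, -1.0, 2.0, 0.0])

W = circulant(c)
# A circulant matrix-vector product equals a circular convolution.
assert np.allclose(W @ x, circular_conv(c, x))
```

The matrix is fully determined by its first column, so only n values are needed instead of n squared, which is what makes the convolution form cheaper than a general matrix-vector product.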




Blockwise Parallel Decoding for Deep Autoregressive Models

Mitchell Stern, Noam Shazeer, Jakob Uszkoreit

Neural Information Processing Systems

To overcome this limitation, we propose a novel blockwise parallel decoding scheme in which we make predictions for multiple time steps in parallel then back off to the longest prefix validated by a scoring model.
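
The propose-then-validate loop described here can be sketched with toy stand-ins. This is an illustrative sketch under stated assumptions, not the authors' implementation: `propose` and `score_next` are hypothetical callables, validation is done by exact match against the scoring model's greedy choice, and the checks run sequentially here although the scheme performs them in parallel.

```python
from typing import Callable, List

def blockwise_decode(
    propose: Callable[[List[int], int], List[int]],  # hypothetical: guesses k future tokens
    score_next: Callable[[List[int]], int],          # hypothetical: scoring model's greedy token
    prefix: List[int],
    block_size: int,
    max_len: int,
) -> List[int]:
    out = list(prefix)
    while len(out) < max_len:
        k = min(block_size, max_len - len(out))
        block = propose(out, k)
        # Find the longest prefix of the block that the scoring model agrees with.
        accepted = 0
        for i, token in enumerate(block):
            if score_next(out + block[:i]) != token:
                break
            accepted += 1
        if accepted == 0:
            # Fall back to one scoring-model token so decoding always advances.
            out.append(score_next(out))
        else:
            out.extend(block[:accepted])
    return out

# Toy models (assumptions): the "true" next token is last + 1 modulo 10.
def score_next(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10

def propose(seq: List[int], k: int) -> List[int]:
    guesses, last = [], seq[-1]
    for i in range(k):
        # The first two guesses are correct; later ones are deliberately crude.
        last = (last + 1) % 10 if i < 2 else 0
        guesses.append(last)
    return guesses

result = blockwise_decode(propose, score_next, prefix=[0], block_size=4, max_len=8)
assert result == [0, 1, 2, 3, 4, 5, 6, 7]
```

Because every accepted token matches what the scoring model would have chosen, the output is identical to token-by-token greedy decoding with `score_next`, only reached in fewer iterations.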